13. Chain Rule & Implicit Differentiation

b. Chain Rule

1. Derivative of a Composition

We will often need to take the derivative of a composition of functions. Let's start with a very simple example, the composition of two straight lines.

Suppose \(z=f(y)=4y+2\) and \(y=g(x)=3x+5\). Find the derivative of \(z=(f\circ g)(x)=4(3x+5)+2=12x+22\).

All three are straight lines, so each derivative is just the slope, \(m=\dfrac{\text{rise}}{\text{run}}\).
For \(z=4y+2\): \[ \dfrac{dz}{dy}=\dfrac{\Delta z}{\Delta y}=4 \] For \(y=3x+5\): \[ \dfrac{dy}{dx}=\dfrac{\Delta y}{\Delta x}=3 \] For \(z=12x+22\): \[ \dfrac{dz}{dx}=\dfrac{\Delta z}{\Delta x}=12 \] How can we get \(\dfrac{dz}{dx}\) from \(\dfrac{dz}{dy}\) and \(\dfrac{dy}{dx}\)?
Answer: We just multiply them: \[\begin{aligned} \dfrac{dz}{dx}=\dfrac{\Delta z}{\Delta x} =\dfrac{\Delta z}{\Delta y}\dfrac{\Delta y}{\Delta x}=(4)(3)=12 \end{aligned}\] Or: \[ (f\circ g)'(x)=f'(y)g'(x)=(4)(3)=12 \] Or: \[ \dfrac{dz}{dx}=\dfrac{dz}{dy}\dfrac{dy}{dx}=(4)(3)=12 \] So it seems we just multiply derivatives.

What happens if the functions are not just straight lines?

Suppose \(z=f(y)=4y^5+2\) and \(y=g(x)=3x^2+5\). Find the derivative of \(z=(f\circ g)(x)=4(3x^2+5)^5+2\).

We already know what the answer should be, because the Extended Power Rule applies directly. Without much simplification:
For \(z=4(3x^2+5)^5+2\): \[ \dfrac{dz}{dx}=20(3x^2+5)^4(6x) \] What happens if we just multiply derivatives? \[ \dfrac{dz}{dx}=\dfrac{dz}{dy}\dfrac{dy}{dx}=[20y^4][6x] \] The problem is that this formula still involves \(y\), but we want \(\dfrac{dz}{dx}\) to be a function of just \(x\). The solution is simply to replace \(y\) by \(g(x)=3x^2+5\): \[\begin{aligned} \dfrac{dz}{dx}&=\left.\dfrac{dz}{dy}\right|_{y=g(x)}\dfrac{dy}{dx} \\ &=\Big[20y^4\Big]_{y=3x^2+5}[6x]=20(3x^2+5)^4(6x) \end{aligned}\] This agrees with our direct computation!
In prime notation, this is: \[\begin{aligned} (f\circ g)'(x)&=f'(g(x))g'(x) \\ &=\Big[20y^4\Big]_{y=3x^2+5}(6x)=20(3x^2+5)^4(6x) \end{aligned}\]
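As a sanity check, the formula \(f'(g(x))g'(x)\) for this example can be compared with a numerical slope. The following is only a minimal sketch: the helper names `chain_rule_derivative` and `central_difference`, the step size `h`, and the test point `x0 = 1.0` are illustrative choices, not part of the text.

```python
def g(x):
    """Inner function y = g(x) = 3x^2 + 5."""
    return 3 * x**2 + 5

def f(y):
    """Outer function z = f(y) = 4y^5 + 2."""
    return 4 * y**5 + 2

def f_prime(y):
    """Derivative of the outer function, dz/dy = 20y^4."""
    return 20 * y**4

def g_prime(x):
    """Derivative of the inner function, dy/dx = 6x."""
    return 6 * x

def chain_rule_derivative(x):
    """Outer derivative evaluated at the inner function, times inner derivative."""
    return f_prime(g(x)) * g_prime(x)

def central_difference(func, x, h=1e-6):
    """Numerical slope of func at x (hypothetical helper, h chosen by hand)."""
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 1.0
exact = chain_rule_derivative(x0)                      # 20 * 8^4 * 6
approx = central_difference(lambda x: f(g(x)), x0)     # slope of the composition
print(exact, approx)
```

The two printed numbers should agree to many digits, while the unevaluated product \([20y^4][6x]\) cannot even be computed until \(y\) is replaced by \(g(x)\).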

If \[ p(x)=(f\circ g)(x)=f(g(x)) \] then \[ p'(x)=f'(g(x)) g'(x). \] In Leibniz notation this is \[ \dfrac{dp}{dx}=\left.\dfrac{df}{du}\right|_{u=g(x)}\dfrac{dg}{dx} \] Equivalently, if \(z=z(y)\) and \(y=y(x)\), then: \[ \dfrac{dz}{dx}=\left.\dfrac{dz}{dy}\right|_{y=y(x)}\dfrac{dy}{dx} \] “The derivative of a composition is the derivative of the outer function evaluated at the inner function times the derivative of the inner function.”

We combine the two differentials \[ dz=\dfrac{dz}{dy}\,dy \qquad \text{and} \qquad dy=\dfrac{dy}{dx}\,dx \] to get \[ dz=\dfrac{dz}{dy}\dfrac{dy}{dx}\,dx \] and compare this to \[ dz=\dfrac{dz}{dx}\,dx \] to conclude \[ \dfrac{dz}{dx}=\dfrac{dz}{dy}\dfrac{dy}{dx}. \]

Let \(p(x)=(f\circ g)(x)=f(g(x))\). We will compute \(p'(a)\). By the definition of the derivative, \[\begin{aligned} p'(a)&=\lim_{h\rightarrow 0} \dfrac{p(a+h)-p(a)}{h} \\ &=\lim_{h\rightarrow 0} \dfrac{f(g(a+h))-f(g(a))}{h} \end{aligned}\] Now assume \(g'(a)\ne0\). (The case \(g'(a)=0\) is beyond the level of this class.) Then the linear approximation says \[ g(a+h)\approx g(a)+g'(a)h \] and this approximation gets better as \(h\rightarrow 0\). Now let \(k=g(a+h)-g(a)\). Then by the linear approximation \(k\approx g'(a)h\). Since \(g'(a)\ne0\), we have \(k\ne0\) for all sufficiently small \(h\ne0\), and \(h\rightarrow 0\) if and only if \(k\rightarrow 0\). In the limit formula for the derivative, we multiply and divide by \(k=g(a+h)-g(a)\), split the limit as a product of two limits and re-express the first limit in terms of \(k\): \[\begin{aligned} p'(a)&=\lim_{h\rightarrow 0} \left[\dfrac{f(g(a+h))-f(g(a))}{g(a+h)-g(a)}\right] \left[\dfrac{g(a+h)-g(a)}{h}\right] \\ &=\lim_{h\rightarrow 0}\dfrac{f(g(a+h))-f(g(a))}{g(a+h)-g(a)} \lim_{h\rightarrow 0}\dfrac{g(a+h)-g(a)}{h} \\ &=\lim_{k\rightarrow 0}\dfrac{f(g(a)+k)-f(g(a))}{k} \lim_{h\rightarrow 0}\dfrac{g(a+h)-g(a)}{h} \end{aligned}\] The first limit is \(f'(g(a))\). The second limit is \(g'(a)\). So \[ p'(a)=f'(g(a)) g'(a). \] Finally, replace \(a\) by \(x\).
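The splitting of the difference quotient into two factors can also be watched numerically. The sketch below reuses the earlier example \(f(y)=4y^5+2\), \(g(x)=3x^2+5\) at \(a=1\), where \(f'(g(a))=20\cdot 8^4=81920\) and \(g'(a)=6\); the function name `quotient_factors` and the list of step sizes are assumptions made for illustration only.

```python
def g(x):
    """Inner function g(x) = 3x^2 + 5."""
    return 3 * x**2 + 5

def f(y):
    """Outer function f(y) = 4y^5 + 2."""
    return 4 * y**5 + 2

def quotient_factors(a, h):
    """Return the two factors of the difference quotient of p = f o g at a."""
    k = g(a + h) - g(a)                      # change in the inner function
    outer = (f(g(a) + k) - f(g(a))) / k      # tends to f'(g(a)) as k -> 0
    inner = k / h                            # tends to g'(a) as h -> 0
    return outer, inner

a = 1.0
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    outer, inner = quotient_factors(a, h)
    # The product of the two factors is exactly the difference quotient of p.
    print(f"h={h:g}  outer={outer:.3f}  inner={inner:.5f}  product={outer * inner:.3f}")
```

As \(h\) shrinks, the first column should approach \(81920\), the second \(6\), and their product \(491520\), which is what the proof asserts in the limit.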

Usually, we do not write the evaluation, \[ \dfrac{dz}{dx}=\dfrac{dz}{dy}\dfrac{dy}{dx} \] but it is still there. Further, it is OK to think that we are cancelling the \(dy\)'s.

Also notice that the Extended Power Rule is just the special case of the Chain Rule where the outer function is a power. In particular, if \(z=y^n\) and \(y=g(x)\), then \(z=g(x)^n\) and: \[\begin{aligned} \dfrac{dz}{dx}&=\left.\dfrac{dz}{dy}\right|_{y=g(x)}\dfrac{dy}{dx} \\ &=\left.ny^{n-1}\right|_{y=g(x)}\dfrac{dy}{dx} =ng(x)^{n-1}\dfrac{dg}{dx} \end{aligned}\] which is the Extended Power Rule.

Find \(\dfrac{dy}{dt}\) if \(y=(2t^3+5)^2\).

First let \(y=x^2\) and \(x=2t^3+5\). By the Chain Rule we need to compute \(\dfrac{dy}{dt}=\dfrac{dy}{dx}\dfrac{dx}{dt}\). We first find the derivatives: \[ \dfrac{dy}{dx}=2x \qquad \dfrac{dx}{dt}=6t^2 \] We then plug these into the formula for the Chain Rule: \[ \dfrac{dy}{dt}=(2x)(6t^2) \] Then we remember that the derivative of the outer function needs to be evaluated at the inner function, \(x=2t^3+5\): \[\begin{aligned} \dfrac{dy}{dt} &=2(2t^3+5)(6t^2) \\ &=24t^5+60t^2 \end{aligned}\]

Since we have complete information, that is, we know the formulas for all the functions involved, we can alternatively compute the composition and take its derivative directly: \[\begin{aligned} y(x(t)) &=[x(t)]^2=(2t^3+5)^2=4t^6+20t^3+25 \\ \dfrac{dy}{dt} &=24t^5+60t^2 \end{aligned}\] This is simpler only when the exponent is small, and it is possible only when we have complete information.
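The agreement between the two routes can be confirmed with exact integer arithmetic. This is a small sketch; the function names `dy_dt_chain` and `dy_dt_expanded` and the range of test values are illustrative assumptions.

```python
def dy_dt_chain(t):
    """Chain Rule form: (dy/dx evaluated at x = 2t^3 + 5) times dx/dt."""
    x = 2 * t**3 + 5          # inner function
    return (2 * x) * (6 * t**2)

def dy_dt_expanded(t):
    """Derivative of the expanded polynomial 4t^6 + 20t^3 + 25."""
    return 24 * t**5 + 60 * t**2

# Integer inputs keep the comparison exact (no rounding error).
for t in range(-3, 4):
    assert dy_dt_chain(t) == dy_dt_expanded(t)
    print(t, dy_dt_chain(t))
```

Because both expressions are polynomials of degree 5, agreement at seven distinct points is enough to show they are the same polynomial.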

Find \(\dfrac{dy}{dx}\) if \(y=\sin(8x+3)\).

\(\dfrac{dy}{dx}=8\cos(8x+3)\)

We split the function as \(y=\sin(u)\) where \(u=8x+3\). The Chain Rule gives \(\dfrac{dy}{dx}=\dfrac{dy}{du}\dfrac{du}{dx}\). We first find the derivatives: \[ \dfrac{dy}{du}=\cos(u) \qquad \dfrac{du}{dx}=8 \] We then plug these values into the formula for the Chain Rule: \[ \dfrac{dy}{dx}=\cos(u)(8) \] Finally, we remember that the derivative of the outer function needs to be evaluated at the inner function. \[\begin{aligned} \dfrac{dy}{dx} &=\cos(8x+3)(8) \\ &=8\cos(8x+3) \end{aligned}\]

In practice, we do not separately write out the inner and outer functions and their derivatives. We do it all at once. For example, to differentiate \(f(x)=\tan(3x^2+5x)\), we identify the outer function as \(\tan u\) (where we mentally take \(u\) as our intermediate variable). We write down its derivative, which is \(\sec^2 u\), but immediately evaluate it at the inner function \(u=3x^2+5x\), and finally multiply by the derivative of the inner function, which is \(6x+5\). Thus we do it all in one line: \[ f'(x)=\sec^2(3x^2+5x)\cdot(6x+5) \]
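The "all at once" pattern is the same three ingredients every time: the derivative of the outer function, the inner function, and the derivative of the inner function. A minimal sketch of that pattern, assuming a hypothetical helper named `chain_rule` and an arbitrarily chosen test point `x0 = 0.1`, might look like this:

```python
import math

def chain_rule(outer_prime, inner, inner_prime):
    """Build x -> outer_prime(inner(x)) * inner_prime(x) (hypothetical helper)."""
    def derivative(x):
        return outer_prime(inner(x)) * inner_prime(x)
    return derivative

def sec_squared(u):
    """Derivative of tan(u), written as 1 / cos(u)^2."""
    return 1.0 / math.cos(u) ** 2

# f(x) = tan(3x^2 + 5x): outer tan(u), inner u = 3x^2 + 5x, inner derivative 6x + 5
f_prime = chain_rule(sec_squared, lambda x: 3 * x**2 + 5 * x, lambda x: 6 * x + 5)

def f(x):
    return math.tan(3 * x**2 + 5 * x)

x0 = 0.1                                        # test point, away from any asymptote
h = 1e-6                                        # step size chosen by hand
numeric = (f(x0 + h) - f(x0 - h)) / (2 * h)     # central-difference slope
print(f_prime(x0), numeric)
```

The two printed values should match closely, which supports the one-line answer without proving it.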

If \(f(x)=e^{\sin x}\), find \(f'(x)\). Try to write the derivative in a single step.

\(f'(x)=e^{\sin x}\cos x\)

Recall the derivative of \(e^u\) is \(e^u\), but here we evaluate \(u\) as \(u=\sin x\). Then we multiply by the derivative of \(\sin x\) which is \(\cos x\): \[ f'(x)=e^{\sin x}\cos x \]

Compute the derivative of \(g(t)=\sec(t^2+3t)\).

\(g'(t)=(2t+3)\sec(t^2+3t)\tan(t^2+3t)\)

The derivative of \(\sec(u)\) is \(\sec(u)\tan(u)\) which we immediately evaluate at \(u=t^2+3t\). Then we multiply by the derivative of \(u=t^2+3t\) which is \(2t+3\): \[ g'(t)=\sec(t^2+3t)\tan(t^2+3t)(2t+3) \] To prevent confusion about what is the argument of \(\tan\), we usually write the extra polynomial at the beginning: \[ g'(t)=(2t+3)\sec(t^2+3t)\tan(t^2+3t) \] However, remember that the derivative of the inner function is computed last!
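The three exercise answers above, \(8\cos(8x+3)\), \(e^{\sin x}\cos x\) and \((2t+3)\sec(t^2+3t)\tan(t^2+3t)\), can be spot-checked against numerical slopes. This is a rough sketch only: the helper names, the sample points, and the step size are assumptions, and agreement at a few points is evidence rather than proof.

```python
import math

def central_difference(func, x, h=1e-6):
    """Numerical slope of func at x (hypothetical helper)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Each entry: (label, function, claimed derivative from the text)
exercises = [
    ("sin(8x+3)",
     lambda x: math.sin(8 * x + 3),
     lambda x: 8 * math.cos(8 * x + 3)),
    ("exp(sin x)",
     lambda x: math.exp(math.sin(x)),
     lambda x: math.exp(math.sin(x)) * math.cos(x)),
    ("sec(t^2+3t)",
     lambda t: 1.0 / math.cos(t**2 + 3 * t),
     lambda t: (2 * t + 3) * math.tan(t**2 + 3 * t) / math.cos(t**2 + 3 * t)),
]

def max_error(func, claimed, points):
    """Largest scaled gap between the claimed and the numerical derivative."""
    return max(
        abs(central_difference(func, p) - claimed(p)) / max(1.0, abs(claimed(p)))
        for p in points
    )

sample_points = (0.1, 0.2, 0.3)   # chosen to avoid the asymptotes of sec
for label, func, claimed in exercises:
    print(label, max_error(func, claimed, sample_points))
```

Each printed error should be tiny compared with 1; a sign error or a forgotten inner derivative would show up as an error of ordinary size.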

© MYMathApps

Supported in part by NSF Grant #1123255